I. INTRODUCTION


This paper describes efforts by the National Aeronautics and Space 
Administration/National Space Technology Laboratories, Earth Resources 
Laboratory (ERL), and the United States Department of Agriculture, Eco- 
nomics and Statistics Service (ESS), to investigate techniques of processing 
various Landsat data sets for the purposes of land cover classification 
and area estimation. A Missouri study site comprising a single Landsat 
scene was selected. Ground-gathered and Landsat data were synthesized 
and analyzed on both the ERL and ESS computer systems. This study was 
not designed to compare these two systems but rather to evaluate different 
analytical tasks and procedures and their effect on the results obtained 
from Landsat classifications. 

The objectives of this study were to: 

• Determine classification and estimation differences between
unitemporal and multitemporal analysis.

• Determine classification and estimation differences using all 
multispectral scanner (MSS) bands, various subsets, and trans- 
formed MSS data. 

• Evaluate land cover estimates derived from EDITOR regression
methods.

• Evaluate the adequacy of June Enumerative Survey (JES) segment 
data for representing the spectral diversity of all land cover 
types. 

• Evaluate the effect of misregistered multitemporal data in 
classification results. 

Methods and results of the investigation are discussed in the
following sections.





II. DATA SOURCES 


The study site included 11 counties in north central Missouri. All
ground and Landsat data used in the study were collected during the 1979
growing season.


A. Ground Data 

Thirty-three ESS June Enumerative Survey sample segments were located 
throughout the 11-county area. The crop or land use was recorded for all 
land within each segment, typically 2.5 square kilometers in size. During 
June a trained enumerator delineated the land cover information for each 
segment on an aerial photograph. Segments from this JES sample were then 
registered to a base map and all field boundaries were digitized and trans- 
formed into latitude-longitude coordinates. 

B. Landsat Data 

Landsat MSS data over path 28, row 32 of the Worldwide Reference
System were obtained for May 14, August 3, and September 17. Efficient
utilization of Landsat data requires knowing the geographic location of
each pixel within the scene. Landsat row-column coordinates were related
to map latitude-longitude or UTM coordinates by scene-to-map registration.
The major components of this map registration technique are discussed by
Hanuschak, et al. (1979), and Joyce, et al. (1980). Results indicate the
registration accuracy of an entire scene to be within one pixel for the
57 x 57-meter pixel size of P-format Landsat data.

To conduct multitemporal analysis, the Landsat images had to be
registered to each other. Several different algorithms and procedures
have been developed to perform scene-to-scene registration of Landsat
images (Anuta, 1970, 1977; Joyce, et al., 1980). In each procedure one
scene was selected as the base frame and a second scene was registered
to this base. In this study the base frame was August 3. ESS techniques
were used to register the May 14 scene to this base and NSTL/ERL
procedures were used to register the May 14 and September 17 data to the
base.

C. Synthesis of Ground and Landsat Data 

In order to simultaneously use the ground and Landsat data during 
computer analysis, the exact location of the field and segment boundaries 
within the Landsat data had to be determined. The first step of this 
procedure was to produce a gray scale map of a window containing the pre- 
dicted area of the segment. Using the digitized segment files, plots 
of the segment ground data were made at the same scale as the gray scale 
maps. Each plot was overlaid on the gray scale map and shifted until the 
field boundaries best fit the field patterns of the map. The new coordi- 
nates of the segment were recorded in a computer file containing the pre- 
cision registration of segment ground data to Landsat data. 

For every Landsat pixel falling within a segment there is a corre- 
sponding ground cover data point. This registration technique permits 
the identification of boundary pixels which can be eliminated from consi- 
deration during training and classification. Further details of these 
techniques are discussed by Ozga and Donovan (1977).

III. DATA PROCESSING 

Analysis and processing were performed on both USDA/ESS and NASA/
NSTL/ERL facilities. The ESS EDITOR system (Ozga and Donovan, 1977) was
used for photo and map digitization, scene-to-scene registration, Landsat
analysis of sample segments and full scenes, and calculation of regression
estimates of land cover types. These processes were executed by using
purchased computer time on a PDP-10 in Cambridge, Mass., and the Illiac
IV computer in Sunnyvale, Calif.

The Earth Resources Laboratory Applications Software (ELAS) was
used at NASA/NSTL to perform scene-to-scene registration, analyze segment
and full scene data for the various land cover types, and examine
misregistration effects. ELAS is a comprehensive operating subsystem,
written principally in FORTRAN, for processing and analyzing digital
imagery data. A Perkin-Elmer 3242 computer was used for all analyses.

All processing was done using a four-category data set. The numbers
of pure field interior pixels for each category contained within the 33
segments were: corn, 1,098; soybeans, 2,138; dense woodland, 559; and
hay/permanent pasture, 3,580 (Table 1). Training statistics were derived
from, and accuracy testing was performed on, the same set of pixels in a
method known as resubstitution.

IV. EVALUATION OF CLASSIFICATION PERFORMANCE USING UNITEMPORAL,
MULTITEMPORAL, AND TRANSFORMED LANDSAT DATA SETS

A. EDITOR Analysis 

Training statistics were developed by clustering the field interior
Landsat pixels within the 33 segments for each of the four categories.
The iterative clustering algorithm was set up according to the parameters
given in Table 2. The May four-channel, August four-channel, and May/August
eight-channel data sets were clustered using these parameters. Treating
each data set the same ensured that major differences in clustering and
classification results were due mainly to differences between the three
data sets.

TABLE 1. NUMBER OF SAMPLE FIELDS BY COVER TYPE FOR 33 JES SEGMENTS FROM
NORTH CENTRAL MISSOURI

Cover Type               Number of   Mean Field   Total    Non-Border   Percent Non-
Category                 Fields      Size (ha)    Pixels   Pixels       Border Pixels

Corn                        51         10.3       1,515      1,098         67.9
Soybeans                   117          9.1       3,277      2,138         65.2
Hay/Permanent Pasture      134         11.5       4,751      3,580         75.4
Dense Woodland              35         10.0       1,076        559         52.0

TABLE 2. CLUSTER PARAMETERS FOR EDITOR ANALYSIS

Cover Type                           Initial No.    Final No.
Category                             of Clusters    of Clusters

Corn                       0.75          16            - 1
Soybeans                   0.75          16             13
Dense Woodland             0.75           8              6
Hay/Permanent Pasture      0.75          16             13

Percent Convergence = 97

Training statistics obtained for each of the four categories, using 
all 33 segments, were input to a maximum likelihood classification algorithm 
on the Illiac IV. The same default parameters were used to classify each of 
the three data sets. The percent correct classification (PCC), commission 
errors, and a breakdown of computer time are given in Table 3. A one-way 
analysis of variance, with arcsin transformation and Newman-Keuls 
Range Test (Steel and Torrie, 1960) was conducted to determine differences 
in the classification results. At the 10% level, the overall PCC of the 
May/August data set was significantly greater than the overall PCC's of 
either the May or August unitemporal sets. The computer time required
to process eight channels of data was slightly less than twice the time 
for processing a single four-channel data set. 
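As a rough illustration of the maximum likelihood step described above, the sketch below assigns a two-band pixel to the spectral class with the highest Gaussian log-likelihood. It is not the EDITOR or Illiac IV code; the class means, covariances, equal prior probabilities, and all function names are assumptions made for the example.

```python
import math

def log_likelihood(pixel, mean, cov):
    """Log of a bivariate normal density, omitting the constant shared by all classes."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    dx = pixel[0] - mean[0]
    dy = pixel[1] - mean[1]
    # Squared Mahalanobis distance using the closed-form inverse of a 2x2 matrix.
    maha = (d * dx * dx - (b + c) * dx * dy + a * dy * dy) / det
    return -0.5 * (math.log(det) + maha)

def classify(pixel, classes):
    """Return the cover type of the most likely spectral class (equal priors assumed)."""
    best = max(classes, key=lambda s: log_likelihood(pixel, s["mean"], s["cov"]))
    return best["cover"]

# Hypothetical training statistics for two spectral classes (bands 5 and 7).
classes = [
    {"cover": "corn", "mean": (20.0, 60.0), "cov": ((4.0, 1.0), (1.0, 9.0))},
    {"cover": "dense woodland", "mean": (15.0, 45.0), "cov": ((3.0, 0.0), (0.0, 5.0))},
]

label = classify((19.0, 58.0), classes)
```

With more than two channels the same rule applies, but the determinant and inverse would be computed with a general linear algebra routine rather than the 2x2 closed form used here.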

The Kauth-Thomas transformation (Kauth, et al., 1978) was applied
to the May four-channel and August four-channel data sets. The brightness
and greenness components from these two transformed sets were combined to
give a new four-channel data set. A second multitemporal data set was
obtained by combining channels 5 and 7 from the May and August raw data.
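The construction of the two four-channel data sets can be sketched as follows. The brightness and greenness weights are the commonly cited Kauth-Thomas coefficients for MSS bands 4 through 7; they are not stated in this paper and are an assumption here, as are the function names and the pixel ordering (band 4, 5, 6, 7).

```python
# Commonly cited Kauth-Thomas MSS weights (assumed; not given in this paper).
BRIGHTNESS = (0.433, 0.632, 0.586, 0.264)
GREENNESS = (-0.290, -0.562, 0.600, 0.491)

def weighted_sum(weights, pixel):
    """Linear combination of the four MSS band values of one pixel."""
    return sum(w * v for w, v in zip(weights, pixel))

def brightness_greenness(pixel):
    """Return (brightness, greenness) for a pixel ordered as bands 4, 5, 6, 7."""
    return weighted_sum(BRIGHTNESS, pixel), weighted_sum(GREENNESS, pixel)

def stack_bg(may_pixel, aug_pixel):
    """Four-channel B,G/B,G vector: May brightness, greenness, then August."""
    return brightness_greenness(may_pixel) + brightness_greenness(aug_pixel)

def stack_bands_5_7(may_pixel, aug_pixel):
    """Four-channel 5,7/5,7 vector taken directly from the raw data."""
    return (may_pixel[1], may_pixel[3], aug_pixel[1], aug_pixel[3])

# Hypothetical digital counts for one pixel on each date.
may = (22, 18, 40, 35)
aug = (20, 14, 55, 50)
bg_vector = stack_bg(may, aug)
raw_vector = stack_bands_5_7(may, aug)
```

Both results have four channels per pixel, which is why their clustering and classification times are comparable to a single-date run.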

These two data sets were clustered using the parameters given in
Table 2. The classifications were obtained using the default parameters
for the EDITOR algorithm. The results are given in Table 4; for
comparison purposes, Table 4 also shows the eight-channel results
reported in Table 3.
TABLE 3. EDITOR CLASSIFICATION RESULTS OF MAY AND AUGUST SINGLE-DATE AND
MAY/AUGUST OVERLAID DATA SETS

Cover Type                      May          Aug         May/Aug
Category                     4-Channel    4-Channel     8-Channel

Corn
  PCC                           53.4         64.1          74.7
  Commission Errors             63.1         58.9          24.9

Soybean
  PCC                           51.3         62.8          76.8
  Commission Errors             38.4         28.2          23.9

Hay/Permanent Pasture
  PCC                           68.1         65.2          76.4
  Commission Errors             23.1         28.4          21.1

Dense Woodland
  PCC                           48.4         35.3          56.7
  Commission Errors             66.2         65.1          49.3

Overall
  PCC                           58.6         61.2          74.2
  Commission Errors             41.4         39.6          26.3

Computer Time (seconds)
  Cluster (PDP-10)               620          677         1,131
  Classify (Illiac IV)             2            3             6
  Total                          622          680         1,137
TABLE 4. EDITOR CLASSIFICATION RESULTS OF THE MAY/AUGUST EIGHT-CHANNEL,
MAY AND AUGUST B,G/B,G*, AND MAY AND AUGUST 5, 7/5, 7** DATA SETS

Cover Type                   May/August    May & Aug     May & Aug
Category                     8-Channel     B,G/B,G*      5,7/5,7**

Corn
  PCC                           74.7         73.3          70.9
  Commission Errors             24.9         31.8          36.5

Soybean
  PCC                           76.8         75.1          71.0
  Commission Errors             23.9         27.5          27.4

Hay/Permanent Pasture
  PCC                           76.4         73.9          73.8
  Commission Errors             21.1         20.2          20.9

Dense Woodland
  PCC                           56.7         54.6          54.8
  Commission Errors             49.3         51.0          54.1

Overall
  PCC                           74.2         72.2          70.6
  Commission Errors             26.3         28.4          29.5

Computer Time (seconds)
  Cluster (PDP-10)             1,131          641           560
  Classify (Illiac IV)             6            6             3
  Total                        1,137          702***        563

  * Brightness and greenness components of the Kauth-Thomas transformation.
 ** Bands 5 and 7.
*** 55 seconds for transforming.
A one-way analysis of variance with arcsin transformation
and Newman-Keuls Range Test was performed at the 10% level. The overall
PCC's of each data set did not differ significantly from each other.

However, from an operational standpoint, the classification performance 
should be compared to the cost of production. As shown in Table 4, a 
2% increase was obtained using all eight channels rather than the four- 
channel transformed data. This small improvement in classification re- 
quired 62% more CPU time. If this proves to be typical, individual users 
should determine the trade-offs between accuracy and costs. 
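The two figures quoted above can be reproduced from the overall PCC and total computer time rows of Table 4. The short calculation below is only a worked check of that arithmetic; the variable and function names are illustrative.

```python
def percent_increase(new_value, base_value):
    """Relative increase of new_value over base_value, in percent."""
    return 100.0 * (new_value - base_value) / base_value

# Overall PCC and total seconds taken from Table 4.
eight_channel = {"pcc": 74.2, "seconds": 1137}
transformed_bg = {"pcc": 72.2, "seconds": 702}

# Gain in overall PCC, in percentage points.
pcc_gain = eight_channel["pcc"] - transformed_bg["pcc"]

# Extra computer time needed for the eight-channel run.
extra_time_percent = percent_increase(eight_channel["seconds"],
                                      transformed_bg["seconds"])

print(f"PCC gain: {pcc_gain:.1f} points, extra time: {extra_time_percent:.0f}%")
```

The same comparison against the 5, 7/5, 7 data set (563 seconds) can be made by substituting its row from Table 4.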

B. ELAS Analysis 

The same 33 JES segments were analyzed using ELAS. The within
class cluster (WCCL) program was used with default parameters for
developing spectral class means and covariance matrices for each land
cover category. WCCL is an unsupervised procedure which collects training
statistics on a point-by-point basis within previously defined classes
(in this case, JES land cover categories). It uses a discard method
to delete statistics made from four or fewer pixels that do not meet
certain scaled distance criteria.

Training statistics developed by WCCL are used as input to a maximum
likelihood classification program, WMAX. A pixel-by-pixel tally of the
maximum likelihood classification with corresponding JES land cover
identification provided the basis for calculation of percent correct
classification and commission error for each Landsat data set. As
mentioned previously, training statistics and accuracy tabulations were
developed on the same set of field interior (non-border) pixels.
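The pixel-by-pixel tally described above amounts to a confusion matrix. The sketch below shows one way to derive PCC and commission error from such a tally. It assumes that commission error is the share of pixels assigned to a category that actually belong to another category; the paper does not spell out its formula, and the counts and names are hypothetical.

```python
def accuracy_summary(confusion, labels):
    """confusion[i][j] is the number of pixels of true class i assigned to class j."""
    per_class = {}
    for k, name in enumerate(labels):
        truth_total = sum(confusion[k])
        assigned_total = sum(row[k] for row in confusion)
        correct = confusion[k][k]
        pcc = 100.0 * correct / truth_total if truth_total else 0.0
        commission = (100.0 * (assigned_total - correct) / assigned_total
                      if assigned_total else 0.0)
        per_class[name] = (pcc, commission)
    total = sum(sum(row) for row in confusion)
    overall_pcc = 100.0 * sum(confusion[k][k] for k in range(len(labels))) / total
    return per_class, overall_pcc

# Hypothetical tally for two categories.
labels = ["corn", "soybeans"]
confusion = [
    [80, 20],   # true corn pixels: 80 labeled corn, 20 labeled soybeans
    [10, 90],   # true soybean pixels: 10 labeled corn, 90 labeled soybeans
]
per_class, overall_pcc = accuracy_summary(confusion, labels)
```

Because training and testing use the same pixels (resubstitution), figures computed this way tend to be optimistic compared with an independent test set.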
Five multiband, multitemporal, and transformed Landsat data sets
were analyzed using the above procedure. Classification results for these
data sets are given in Table 5. Computer times were not compared for ELAS
classifications. A one-way analysis of variance, followed by a Newman-Keuls
test of significance at the 10% level, was performed on the overall percent
correct classifications, which were transformed to arcsin √p in order to
ensure normal distribution, independent means and variances, and homogeneous
variances. The August single-date data set had the lowest overall PCC,
while the three-date data set had the highest. However, the above test
revealed that the overall PCC for the three-date data set was not
significantly different from the overall PCC for the August/September data
set. The overall PCC's of all other data sets were significantly different
from each other. It should be noted that the May scene was not of high
quality and had considerable haze.
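The arcsin √p transformation applied before the analysis of variance can be sketched as below. Only the transformation is shown, not the ANOVA or the Newman-Keuls test. The overall PCC values from Table 3 serve purely as sample inputs, and the function name is an assumption.

```python
import math

def arcsin_sqrt(pcc_percent):
    """Variance-stabilizing transform of a percentage: arcsin(sqrt(p)), in radians."""
    p = pcc_percent / 100.0
    if not 0.0 <= p <= 1.0:
        raise ValueError("percentage must lie between 0 and 100")
    return math.asin(math.sqrt(p))

# Sample inputs: overall PCC values for the three EDITOR data sets (Table 3).
overall_pcc = {"May": 58.6, "August": 61.2, "May/August": 74.2}
transformed = {name: arcsin_sqrt(value) for name, value in overall_pcc.items()}
```

The transform is monotonic, so the ranking of data sets is unchanged; it stretches values near 0% and 100%, where the variance of a proportion would otherwise shrink.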

The August/September four-channel Kauth transformed data set did not 
show an improvement over the four-channel (5, 7/5, 7) data set for the same 
dates. Even though the percent correct classifications for corn and dense 
woodland were higher for the Kauth transformed data, the PCC's for soybeans 
and hay/permanent pasture (which had the largest numbers of field interior 
pixels) dropped in comparison with the data set made up only of bands 5 
and 7 for the two dates. The August/September (5, 7/5, 7) data set, based 
on its good classification of corn and soybeans, was chosen for testing 
subsequent data processing procedures. 

V. EVALUATION OF LAND COVER ESTIMATES 

A. EDITOR Regression Estimates 

The classification results shown in Table 4 were used to obtain
segment level regression estimates for each category using the ESS
regression methodology (Craig, et al., 1978). Table 6 contains the R² and
coefficient of variation (C.V.) values of these estimates. A test for
significant differences is included in the table.

TABLE 5. ELAS CLASSIFICATION RESULTS FOR FIVE MULTIBAND, MULTITEMPORAL,
AND TRANSFORMED LANDSAT DATA SETS FROM ANALYSIS OF 33 JES SEGMENTS FROM
NORTH CENTRAL MISSOURI

All of the corn estimates were significantly different from each
other. The May/August (5, 7/5, 7) corn estimate differed from all
eight-channel and B,G/B,G estimates at the 1% level. These
differences are supported by the variability in the corn estimate C.V.'s.
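For readers unfamiliar with segment level regression, the sketch below fits a simple least-squares line with ground-reported area as the dependent variable and classified pixel count as the independent variable, and reports R². It is a generic ordinary least-squares illustration, not the ESS estimator of Craig, et al. (1978); the segment values and names are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y on x; returns (slope, intercept, r_squared)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared

# Hypothetical segments: pixels classified as corn vs. hectares reported in the JES.
classified_pixels = [120, 260, 410, 530]
reported_hectares = [35.0, 80.0, 131.0, 168.0]

slope, intercept, r_squared = fit_line(classified_pixels, reported_hectares)
```

A higher R² means the classified pixel counts track the ground-reported areas more closely, which in turn lowers the variability of the resulting estimate.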

B. ELAS Large Area Spectral Class Definition 

The August/September (5, 7/5, 7) data set was used to derive
homogeneous spectral classes for the entire 15,120 km², 11-county area.
Spectral class training statistics were developed using the ELAS program
SRCH, which is an unsupervised procedure for collecting training statistics
from homogeneous fields by passing a 3 by 3 pixel window through the data
(Joyce, et al., 1980). For this data set, 7.5% of the total pixels
available in the study site were selected by SRCH for development of 54
spectral class statistics.
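The idea of passing a 3 by 3 window through the data to find homogeneous areas can be illustrated with the toy sketch below. It is not the SRCH algorithm: the non-overlapping window stepping, the max-minus-min homogeneity test, the threshold value, and the function name are all assumptions chosen to keep the example short.

```python
def homogeneous_windows(band, max_range):
    """Return (row, col, mean) for each non-overlapping 3x3 window whose
    values differ by no more than max_range (a hypothetical criterion)."""
    rows, cols = len(band), len(band[0])
    found = []
    for r in range(0, rows - 2, 3):
        for c in range(0, cols - 2, 3):
            values = [band[r + i][c + j] for i in range(3) for j in range(3)]
            if max(values) - min(values) <= max_range:
                found.append((r, c, sum(values) / 9.0))
    return found

# Hypothetical single-band image: a uniform field on the left, an edge on the right.
band = [
    [10, 10, 10, 10, 30, 50],
    [10, 10, 10, 10, 30, 50],
    [10, 10, 10, 10, 30, 50],
]
windows = homogeneous_windows(band, max_range=2)
```

Only windows that pass the homogeneity test contribute to training statistics, which is consistent with a small fraction of the scene being selected.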

The entire study site was classified using a maximum likelihood
classification program, MAXL. A pixel-by-pixel comparison of classification
assignments with JES segment class identification allowed for labeling
of the spectral classes as to their predominant cover type. Thus, the 54
training classes were combined into 7 land cover types. Certain cover
types, such as water, were not represented in the JES segment data, while
other cover types, such as hay and pasture, possessed more spectral
variability than existed in the JES fields (Table 7). These spectral
classes were labeled based on expected seasonal reflectance characteristics
of water and hay as displayed on plots of Landsat MSS band 5 vs. band 7
response.

TABLE 6. EDITOR SEGMENT LEVEL REGRESSION ESTIMATES USING SEVERAL
MULTITEMPORAL DATA SETS

TABLE 7. ELAS CLASSIFICATION RESULTS FOR ANALYSIS OF AUGUST/SEPTEMBER
FOUR-CHANNEL (5, 7/5, 7) DATA SET USING SRCH-DERIVED STATISTICS

Cover Type              Spectral   Mode of Spectral            PCC
Category                Classes    Class Definition

Corn                        5      JES Data                    69.6
Soybeans                   18      JES Data                    87.8
Hay/Permanent Pasture      21      18 Classes-JES Data         72.0
                                   3 Classes-VIS/IR* Plots
Dense Woodland              2      JES Data                    65.3
Winter Wheat                2      JES Data                    Not Tested
Waste                       1      JES Data                    Not Tested
Water                       5      VIS/IR Plots                Excluded from
                                                               JES Sample Frame
Overall                    54      --                          75.7

*Visible/infrared

These results point to the possibility of under-representation of the 
spectral diversity among the land cover types of a large geographic area 
when segment data from only slightly more than 0.2% of the area are used 
for spectral class definition. It should be noted that the JES sample 
was 0.6%, but several segments were not included because of cloud cover. 

Reduced classification accuracy of this whole-scene classification, 
as compared with the results of analysis of only the segments themselves, 
can be attributed to the existence of "mixed" classes developed by the 
SRCH approach. Mixed classes represent cases of spectral similarity among 
different land cover types. In the SRCH procedure, each spectral class was 
defined to represent just one land cover type even for those situations in
which a portion of the JES segment pixels assigned to that spectral class 
belonged to other land cover types. 

VI. EVALUATION OF MISREGISTRATION BETWEEN DATA SETS 

Concern over the possible deleterious effects of pixel misregistration
on classification accuracy of multitemporal data sets led to a
study of intentional registration offsets on the August/September
four-channel (5, 7/5, 7) data set. These two Landsat scenes had been
registered using a manual seed point location procedure followed by
computer-guided control point selection (Joyce, et al., 1980) to achieve a
root mean square (RMS) error of 49 meters for the overlaid data sets.
Intentionally misregistered data sets were produced by adding 20 meters
(about 1/3 pixel) and then 30 meters (about 1/2 pixel) to the element
(column) coordinate of the control point location for the scene being
overlaid. These offsets were chosen because the RMS error resulting from
computer assisted scene-to-scene overlay procedures seldom exceeds the
dimensions of one and one-half pixels for good quality Landsat MSS data.
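The pixel fractions quoted above follow from dividing each offset by the 57-meter pixel size. The short calculation below reproduces them. Treating the 30-meter offset and the 49-meter RMS error as a simple sum is an assumption made here for illustration (it matches the 79-meter figure given in the summary), and the helper name is hypothetical.

```python
PIXEL_SIZE_M = 57.0   # P-format Landsat MSS pixel dimension, in meters
BASE_RMS_M = 49.0     # RMS error of the registered August/September overlay

def meters_to_pixels(distance_m):
    """Express a ground distance as a fraction of one pixel width."""
    return distance_m / PIXEL_SIZE_M

offset_fractions = {offset: meters_to_pixels(offset) for offset in (20.0, 30.0)}

# Worst case considered in the study: largest offset added to the base RMS error.
worst_case_m = BASE_RMS_M + 30.0
worst_case_pixels = meters_to_pixels(worst_case_m)

for offset, fraction in offset_fractions.items():
    print(f"{offset:.0f} m offset = {fraction:.2f} pixel")
print(f"worst case: {worst_case_m:.0f} m = {worst_case_pixels:.2f} pixels")
```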

In Table 8, classification results for the misregistered data sets
are compared with results for the data set with no offset. Overall
classification results for the three data sets are not significantly
different at the 10% level after transformation of PCC's to arcsin √p.
Even with a 30-meter offset, which caused a noticeable misregistration of
ground features when observed on a digital display device, the overall
classification accuracy dropped only 2%. These results confirm the
observations of Cicone, et al. (1976), who found that the effect of
misregistration is not a significant factor of concern in the recognition
of field interior pixels which remain field interior after misregistration.
The lack of significant differences in overall classification accuracies
between registered and misregistered data sets does not reflect the very
real differences arising from reduced availability of pure non-border
pixels and errors in proportion estimation of data sets containing an
inflated number of mixture pixels.

The problem of reduced availability of non-border pixels is crucial for 
cover types which, because of their field size or shape, already have low 
percentages of field interior pixels, as is the case with fields of dense 
woodland shown in Table 1. The percent correct classification for dense 
woodland dropped more than any other cover type in the misregistered data 
sets, while dense woodland also had the smallest percentage of non-border 
pixels. 
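The relationship noted above can be checked directly from the published figures. The sketch below takes the PCC values from Table 8 and the percent non-border pixels from Table 1, then finds the cover type with the largest PCC drop at the 30-meter offset and the one with the smallest share of interior pixels. The variable names are illustrative.

```python
# PCC by cover type for (no offset, 20-m offset, 30-m offset), from Table 8.
pcc_by_offset = {
    "corn": (78.0, 76.3, 76.7),
    "soybeans": (84.9, 84.2, 82.8),
    "hay/permanent pasture": (85.5, 83.1, 83.9),
    "dense woodland": (60.3, 61.2, 56.9),
}

# Percent non-border pixels by cover type, from Table 1.
percent_non_border = {
    "corn": 67.9,
    "soybeans": 65.2,
    "hay/permanent pasture": 75.4,
    "dense woodland": 52.0,
}

# Drop in PCC between the registered data set and the 30-m offset data set.
pcc_drop = {cover: round(values[0] - values[2], 1)
            for cover, values in pcc_by_offset.items()}

largest_drop = max(pcc_drop, key=pcc_drop.get)
fewest_interior = min(percent_non_border, key=percent_non_border.get)
```

Both lookups return the same cover type, which is the pattern the paragraph describes.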
TABLE 8. ELAS CLASSIFICATION RESULTS FOR ANALYSIS OF AUGUST/SEPTEMBER
FOUR-CHANNEL (5, 7/5, 7) DATA SET WITH MISREGISTRATION BETWEEN DATES

                            No Offset*         20-m Offset        30-m Offset
Cover Type                      Commission         Commission         Commission
Category                 PCC    Errors      PCC    Errors      PCC    Errors

Corn                     78.0    20.9       76.3    24.7       76.7    25.7
Soybeans                 84.9    16.7       84.2    17.7       82.8    16.8
Hay/Permanent Pasture    85.5    14.5       83.1    15.3       83.9    16.6
Dense Woodland           60.3    30.4       61.2    39.4       56.9    38.2
Overall                  82.3    17.7       80.8    19.2       80.4    19.6

*Scene-to-scene registration achieved by use of ELAS overlay technique,
resulting in 49-m RMS error for 57 x 57-m pixel size.
VII. SUMMARY

Multiband, multitemporal, and transformed Landsat MSS data sets were
analyzed using pattern recognition procedures employed by the USDA Economics
and Statistics Service and by the NASA/NSTL Earth Resources Laboratory for
the purpose of land cover area estimation. The analyses had in common the
use of field-verified land cover data for training and accuracy testing in
the form of 33 June Enumerative Survey segments, typically 2.5 km² in size.
Corn, soybeans, hay/permanent pasture, and dense woodland predominate in
the landscape of the 11-county north central Missouri test area and were
the four land cover types studied.

Multitemporal data sets gave significantly higher classification 
accuracies than any single-date Landsat data set for data processing pro- 
cedures used by both ESS and ERL. The use of only Landsat MSS bands 5 and 
7 in multitemporal analysis showed no significant difference in overall 
classification accuracy from analysis using bands 4 and 6 in addition to 
bands 5 and 7. Transformed data sets also failed to significantly improve 
classification accuracies, but rather served as a means of reducing data 
from four to two channels per date, thus decreasing processing time. 

Segment level land cover regression estimates were obtained using 
the JES data as the dependent variable and Landsat classified results as 
the independent variable. It was found that the use of all eight channels 
for the May/August data set resulted in significantly higher correlation 
coefficient values for corn than use of four-channel Kauth transformed 
data or four-channel band 5, 7/5, 7 data. Other cover types did not show 
significant differences between data sets. 
ELAS analysis results indicated that the spectral diversity among
the land cover types was under-represented by the 0.2% sample. A follow- 
on study using wall-to-wall field verification data is planned to further 
define an adequate sampling scheme for total land cover mapping. 

Misregistration of two Landsat data sets by as much as 79 meters 
(about one and one-half pixels) did not significantly alter overall 
classification accuracies. Even though a noticeable offset could be ob- 
served in the position of ground features when viewed on a digital display 
device, the "effective purity" of field interior pixels apparently was 
maintained. Existing algorithms for scene-to-scene overlay are adequate 
for multitemporal data analysis as long as statistical class development 
and accuracy assessment are restricted to non-border pixels. 